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Abstract 

Skew-symmetric families of distributions such as the skew-normal and skew-t represent 
supersets of the normal and t distributions, and they exhibit richer classes of extremal be¬ 
haviour. By defining a non-stationary skew-normal process, which allows the easy handling 
of positive definite, non-stationary covariance functions, we derive a new family of max- 
stable processes - the extremal-skew-f process. This process is a superset of non-stationary 
processes that include the stationary extremal-t processes. We provide the spectral repre¬ 
sentation and the resulting angular densities of the extremal-skew-t process, and illustrate 
its practical implementation 
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1 Introduction 


The modern-day analysis of extremes is based on results from the theory of stochastic processes. 


In particular, max-stable processes (de Haan, 1984) are a popular and useful tool when modelling 
extremal responses in environmental, financial and engineering applications. Let S C denote 
a ^-dimensional region of space (or space-time) over which a real-valued stochastic process 


with a continuous sample path on S can be defined. Considering a sequence Yi,... ,Yn 


1 





of independent and identically distributed (iid) copies of Y, the pointwise partial maximum can 
be defined as 

Mn{s) = max Yi{s), s G S. 

If there are sequences of real-valued functions, an{s) > 0 and bn{s), for s G S and n = 1, 2,.. 
such that 

Mn{s) — bn{s) 'I ^ ^ 

-—- > ^{[/(sjjseS, 

(^n[S) j 

converges weakly as n —>• oo to a process I7(s} with non-degenerate marginal distributions for 


all s G S, then t/(s) is known as a max-stable process (de Haan and Ferreira, 2006, Ch. 9). In 


this setting, for a finite sequence of points {sj)j^i in §, where / = {1,..., d} is an index set, the 


finite-dimensional distribution of U is then a multivariate extreme value distribution (de Haan 


and Ferreira 2006, Ch. 6). This distribution has generalised extreme value univariate margins 


and, when parameterised with unit Frechet margins, has a joint distribution function of the form 

G{xj,j G I) = exp{-V{xj,j G /)}, Xj > 0, 

where Xj = x{sj). The exponent function V describes the dependence between extremes, and 
can be expressed as 

V{xj,jel)= / max{wj/xj)H{dwi,...,dwd), 

Jw fs/ 

where the angular measure H is a finite measure defined on the d-dimensional unit simplex 
W = {tc G : rci = 1}, satisfying the moment conditions f^Wj H{dw) = 1, j G /, 


(de Haan and Ferreira, 2006, Ch. 6). 

In recent years a variety of specific max-stable processes have been developed, many of which 


have become popular as they can be practically amenable to statistical modelling (Davison et al 


2012). The extremal-t process (Opitz 2013) is one of the best-known and widely-used max-stable 


processes, from which the Brown-Resnick process (Brown and Resnick 1977 Kabluchko et al. 


2009), the Gaussian extreme-value process (Smith 1990) and the extremal-Gaussian processes 


(Schlather, 2002) can be seen as special cases. In their most basic form, the Brown-Resnick 


and the extremal-t processes can be respectively understood as the limiting extremal processes 
of strictly stationary Gaussian and Student-t processes. However, in practice, data may be 
non-stationary and exhibit asymmetric distributions in many applications. In these scenarios. 


skew-symmetric distributions (Azzalini, 2013, Arellano-Valle and Azzalini, 2006, Azzalini, 2005 


Genton, 2004 Azzalini, 1985) provide simple models for modelling asymmetrically distributed 


data. However, the limiting extremal behaviour of these processes has not yet been established. 
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In this paper we characterise and deveiop statisticai modeis for the extremai behaviour of 
skew-normai and skew-t distributions. The joint taii behaviours of these skew distributions are 
capabie of describing a far wider range of dependence ieveis than that obtained under the sym¬ 
metric normai and t distributions. We provide a definition of a skew-normai process which is 
in turn a non-stationary process. This provides an accessibie approach to constructing positive 
definite, non-stationary covariance functions when working with non-Gaussian processes. Re- 
centiy some forms of non-stationary dependent structures embedded into max-stabie processes 


have been studied by Huser and Genton (2015). We show that on the basis of the skew-normai 


process a new famiiy of max-stabie processes - the extremai-skew-t process - can be obtained. 
This process is a superset of non-stationary processes that inciudes the stationary extremai- 


t processes (Opitz, 2013). From the extremal-skew-t process, a rich family of non-stationary, 


isotropic or anisotropic extremal coefficient functions can be obtained. 

This paper is organised as follows: in Section we first introduce a new variant of the 
extended skew-f class of distributions, before developing a non-stationary version of the skew- 
normal process. In both cases we discuss the stochastic behavior of their extreme values. In 
Sectionwe derive the spectral representation of the extended extremal skew-f process. Section 
[^discusses inferential aspects of the extremal skew-t dependence model, and Sectionprovides 
a real data application. We conclude with a Discussion. 

2 Preliminary results on skew-normal processes and skew-t dis¬ 
tributions 


We introduce two preliminary results that will be used in order to present our main contribution 
in Section]^ the extremal-skew-t process. In Section 2.1 we define the non-central extended 
skew-t family of distributions, which is a new variant of the class introduced by Arellano-Valle 


and Genton (2010), that allows a non-centrality parameter. In Section 2.2 we present the 


development of a new non-stationary, skew normal random process. 

Hereafter, we use Y ~ '^>^{ 01 , 62 ,...) to denote that V is a d-dimensional random vector with 
probability law V and parameters 61 , 62 , ■■ ■■ When d = 1 the subscript is omitted for brevity. 
Similarly, when a parameter is equal to zero or a scale matrix is equal to the identity (both in 
a vector and scalar sense) so that reduces to an obvious sub-family, it is also omitted. 
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2.1 The non-central, extended skew-t distribntion 


While several skew-symmetric distributions have been developed (see e.g., Genton, 2004, Azza- 


lini, 2013), we focus on the skew-normal and skew-t distributions. 


Denote a d-dimensional skew-normally distributed random vector by Y r\j 


(Arellano-Valle and Genton, 2010). This random vector has probability density function (pdf) 

( 1 ) 




\/i + Qn(“)} 

where (pdiu', D) is a d-dimensional normal pdf with mean /x G and dx d covariance matrix 
Q, z = {y — fi)/uj, uj = diag(D)^/^, D = Qni^) — and 4>(-) is the standard 

univariate normal cumulative distribution function (cdf). The shape parameters a G and 
T G M are respectively slant and extension parameters. The cdf associated with Q is termed the 


extended skew-normal distribution (Arellano-Valle and Genton, 2010) of which the skew-normal 


and normal distributions are special cases ( jArellano- Valle and Genton 2010, Azzalini, 2013). 
For example, in the case where a = 0 and r = 0 the standard normal pdf is recovered. 

Definition 1. V is a d-dimensional, non-central extended skew-t distributed random vector, 
denoted by V ~ 57d(^, D, a, r, k, v ), if for y G it has pdf 

V'dCy; D, zx) 






\/l+Qn(“) ’ 's/^+Qn(°‘) ’ 




( 2 ) 

where ?/;rf(y;/x, D, z/) is the pdf of a d-dimensional t-distribution with location /x G d x d 
scale matrix D and u G M"*" degrees of freedom, 4'(-; a, ly) denotes a univariate non-central t cdf 
with non-centrality parameter a G M and u degrees of freedom, and Qq-i{z) = The 

remaining terms are as defined in Q. The associated cdf is 

^d+i{Y, Q*,k*,u} 


Td(y;/x,D,a,r, K,z^) = 


(3) 


where z = {z~^ is a (d -|- l)-dimensional (non-central) t cdf with covariance matrix 

and non-centrality parameters 



and u degrees of freedom, and where 

6 = {I + Cla, R, = {1 + k, f = {1 + t. (4) 
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When the non-centrality parameter k is zero, then the extended skew-t family of Arellano- 


Valle and Genton (2010) is obtained. For the non-central skew-t family, we now demonstrate 


modified properties to those discussed in Arellano-Valle and Genton (2010) 


Proposition 1 (Properties). Let Y r-u STd{ti-,LL,a,T,K,v). 


1. Marginal and conditional distributions. Let I C d} and I = {l,...,d}\I iden¬ 

tify the di- and dj-dimensional subvector partition of Y such that Y = {YJ,YJwith 
corresponding partitions of the parameters (/r, 11, a). Then 


(a) Yi ^ where 


a} = 






( 5 ) 


gzven — ^7/^77 ^77* 

(b) {Yj\Yi = yj) ~ STdj{lJ-i.i,Llf.j,aj.j,Tj.j, where p,j.j = /rj + LljjQjj{yj — 

Ti), %7 = Cl = {^ + + di), zi = LoJ^ivi - Hi), uji = 

diag{ojjj)^^‘^, = zjVtjjZj, = VLjj — LtjjLljjLljj, aj,j = udj.^LOj^aj, 

uij.i = diag{Lljj.jy/‘^, uj = diag{ujjY/‘^ , tj.j = ^77^77^ + a])zi + t], 

Kj,j = and upj = u -\- dj. 


2. Conditioning type stochastic representation. We can write Y = y, -\- LIZ, where Z = 
(A|a''~A + r > Aq), and where X ~ Td{Ll,v) is independent of Xq ~ T{k,v). 

3. Additive type stochastic representation. We can write Y = y-\-LlZ, where Z = \J Ai + 
SXq, Xi ~ 7d(Ll — dd"'", K,iy + 1) is independent of Xq = (Ao| Aq + r > 0), Aq ~ T(k, v), 
5 G (—1, l)*^ and where f and R are as in (|^. 


Proof in Appendix A. 1 


We conclude by presenting a final property of the non-central skew-t family. The next result 
describes the extremal behaviour of observations drawn from a member of this class. 


Proposition 2. Let Zi,... ,Zn be iid copies of Z ^ STd{Ll, a, r, k, v) and be the componen¬ 
twise sample maxima. Define an = {any, •.., an,d)~^, where 

n{r(i./2)}-ir{(z. + 1)/2 }i/('^- 2)/2 T(a*^A7Tl; n,v + l) 

{jj /{^ + 
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where a* = t* = and k* = are the marginal parameters under Proposition 

m- Then Mn/an => ?7 as n ^ +oo, where U has univariate v-Frechet marginal distributions 
(i.e. e~^ , x>{)), and exponent function 

\ \ T 


V{xjJ G /) = 



n+l 


X: 


Xa 


- OJi 


I > 


i G F 




( 6 ) 


where 'I'd-i is a {d — 1)-dimensional central extended skew-t distribution with correlation matrix 
shape and extension parameters aj and , and v + 1 degrees of freedom, I = {1,..., d}, 
Ij = I\{j}, and ujij is the {i,j)-th element of Cl. 

Proof (and further details) in Appendix \A.^ 

As the limiting distribution ® is the same as that of the classic skew-t distribution (see 


Padoan, 2011), it exhibits identical upper and lower tail dependence coefficients (e.g. Joe, 1997 


Ch 5). That is, the extension and non-centrality parameters, r and n, do not affect the extremal 
behavior. 


2.2 A non-stationary, skew-normal random process 


While there are several dehnitions of a stationary skew-normal process (e.g. Minozzo and Fer- 


racuti, 2012), stationarity is incompatible with the requirement that all hnite-dimensional dis¬ 


tributions of the process are skew-normal. We now construct a non-stationary version of the 


skew-normal process through the additive-type stochastic representation (e.g. Azzalini, 2013 


Ch. 5). A similar approach was explored by Zhang and El-Shaarawi (2010) for the stationary 


case. 


Definition 2. Let {A(s)}sgs be a stationary Gaussian random process on S with zero mean, 
unit variance and correlation function p{h) = E{A(s)A(s -|- h)} for s G S and h G For 
X' ~ 1) independent of Ar(s), e G M and a function J : S i—)■ (—1,1), define 

X''{s) := X'\X' + £>t), VsGS 
Z{s) := V1-<5(s)2A(s) +J(s)A"(s), s G S. (7) 

Then Z{s) is a skew-normal random process. 

We refer to 6{s) as the slant function. From if (5(s) = 0 for all s G S, then Z is 
a Gaussian random process. Note that Z is a random process with a consistent family of 
distribution functions, since Z{s) = a{s)X{.s) + b{s)Y{s) where a and b are bounded functions 
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and X and Y are random processes with a consistent family of distribution functions. For any 
finite sequence of points si,..., G S the joint distribution of Z{si ),..., Z{sd) is 5A/d(fi, a, r), 
where 


= Ds{t + {Dj^5){Dj^6)'^)Ds 

a = {1 + (8) 


and where S is the d x d correlation matrix of X, <5 = ((5(si),..., (^(sd))"*" and Ds = {Id — 
diag(J^)}^/^, where is the identity matrix (Azzalini, 2013, Ch. 5). As a result, for any lag 


h G the distributions of {Z{si ),..., Z{sd)'\ and {Z{si + h),..., Z{sd + h)} will differ unless 
(5(s) = 0 for all s G S. Hence, the distribution of Z is not translation invariant and the process 
is not strictly stationary. For s G S and /i G the mean m{s) and covariance function Cs{h) 
of the skew-normal random process are 


m{s) = E{Z(s)} = 5(s)0(e)/<l>(e) 


and 


Cs{h) = Co\{Z{s), Z{s + h)] = p{h)^/{l - (52(s)}{l - ( 52 (s + h)] + 5{s)5{s + h){l - r), (9) 

where r = • Hence, the mean is not constant and the covariance does not 

depend only on the lag h, unless 6{s) = 6o £ (—1,1) for all s G S. In the latter case the 


skew-normal random process is weakly stationary (Zhang and El-Shaarawi, 2010). 

One benefit of working with a skew-normal random field is that the non-stationary covariance 
function ([^ is positive definite if the covariance function of X is positive definite, and if — 1 < 
5(s) < 1 for all s G S. Hence, a valid model is directly obtainable by means of standard 
parametric correlation models p{h) and any bounded function d in (—1,1). If the Gaussian 
process correlation function satisfies p(0) = 1 and p{h) —)• 0 as ||/i|| —)■ -|-cx), then the correlation 
of the skew-normal process satisfies Ps{0) = 1 and 

(h) - h){l - r) 

^/cs{0)cs{h) ^/{l- (52(s)r)(l - -F h)r) ’ 

as ||/i|| —)> -|-oo. Hence Ps{h) = 0 if either 6{s) or S{s + h) are zero. Conversely, if both 6{s) —>■ ±1 
and 6{s -|- /i) —)• ±1 then ps{h) —)• ±1. 

The increments Z{s + h) — Z{s) are skew-normal distributed for any fixed s G S and /i G 


(see Azzalini, 2013, Ch. 5) and the variogram 2'^s{h) = Var{Z(s + h) — Z(s)} is equal to 

5'^{s + h) + 


27 ,(/i) = 2 l-c,(/i)-' 


2/r 
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When h = 0 the variogram is zero, and when ||/i|| —>■ +oo the variogram approaches a constant 
< 2, respectively resulting in spatial independence or dependence for large distances h. We can 
now infer the conditions required so that Z{s) has a continuous sample path. 


Proposition 3. Assume that S C M. ^ skew-normal process {Z{s),s G S} has a continuous 
sample path if S(s -j- h) — S(s) = o(l) and 1 — p{h) = 0{\ log |/i| |““) for some a > 3, as h ^ 0. 


This result follows by noting that rs{h) = p{h) + (5^(s)(l — p{h)) + o(l) as /i —)• 0 and this is a 
consequence of the continuity assumption on <5(s), where rs{h) = Cs{h) + r{(5^(s + /i) + (5^(s)}/2. 
Therefore, 1 — rs{h) = 0(| log |/i| |““) as /i —?• 0. Thus, the proof follows from the results in 


Lindgren (2012, page 48). This means that continuity of the skew-normal process is assured if 


(5(s) is a continuous function, in addition to the usual condition on the correlation function of 


the generating Gaussian process (e.g. Lindgren, 2012, Ch. 2). 

Figure [^illustrates trajectories of the skew-normal process for A: = 1, with X{s) a zero mean 
unit variance Gaussian process on [0,1] with isotropic power-exponential correlation function 


p(/i; 'd) = exp{— (L/A)^}, -d = (A, ^), A > 0, 0 < ^ < 2, h > 0, (10) 


with ^ = 1.5, A = 0.3 and h G [0,1]. The first row shows the standard stationary case. The 
second row illustrates the non-stationary correlation function obtained with s = 0.1 (solid line) 
behaving close to the stationary correlation, however decaying more slowly as s increases and 
approaching, but not reaching zero exactly. The third row demonstrates both that points may 
be negatively correlated and that ps{h) is not necessarily a decreasing function in h. The 
bottom row highlights this even more clearly - correlation functions need not be monotonically 
decreasing - implying that pairs of points far apart can be more dependent than nearby points. 

Simulating a skew-normal random process is computationally cheap through Definition 
with the simulation of the required stationary Gaussian process achievable through many fast 


algorithms (e.g.. Wood and Ghan, 1994 Ghan and Wood, 1997). Rather than relying on (j^, 
for practical purposes, to directly simulate from a skew-normal process with given parameters 


a, D and r, a conditioning sampling approach can be adopted (Azzalini, 2013, Gh. 5). 

Specifically, let X (s) define a zero-mean, unit variance stationary Gaussian random field on 
S with correlation function oj[h) = E{X(s)A(s -|- h)'\ and let D be the d x d correlation matrix 
of X(si),... ,X(srf). Specify a : S i—)• M to be a continuous square-integrable function and let 
{a,X) = fga(s}X(s} ds be the inner product. Let X' be a standard normal random variable 
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Figure 1: Simulations from four univariate skew-normal random processes on [0,1] with e = 0. The left 
column shows the sample path (solid line) of the simulated process Z{s) and of the generating Gaussian 
process X(s) (grey line). The middle column illustrates the slant function 6{s) (solid line) and the mean 
m(s) of the process (dashed line). The right column displays the non-stationary correlation functions at 
locations s = 0.1 (solid line), 0.5 and 0.75 (dot-dash). Rows 1-3 use slant function 6{s) = asin(&s) with 
a = 0.95 and 6 = 0,1 and 3 respectively, whereas row 4 uses S{s) = sin(6s) cos(6s) with a = 1.3 and 
6 = 0.9. 


( 11 ) 


independent of X and r G M. If we define 

Z{s) = {X{s)\{a,X)> X' -t] , sES 

then, for any finite set si,..., E §, the distribution of Z{si ),..., Z{sd) is SM{^, a, r), where 
a = {q;(si), ..., Q!(srf)}. For simplicity we also refer to q;(s) as the slant function. More efficient 
simulation of skew-normal processes can be achieved by considering the form .^(s) = X(s) if 


(a,X) > X' — T and Z{s) = —X{s) otherwise (e.g. Azzalini, 2013, Ch. 5). 

We conclude this section by discussing some extremal properties of the skew-normal process 
Z{s). For a finite sequence of points si,...,S(i E S, with d > 2. Each margin Z{si) follows 
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a skew-normal distribution (Azzalini, 2013) and so is in the domain of attraction of a Gumbel 


distribution (Chang and Genton, 2007, Padoan, 2011). Further, each pair (Z(si), Z(sj)} is 


asymptotically independent (Bortot, 2010, Lysenko et ah, 2009). However, in this case a broad 


class of tail behaviours can still be obtained by assuming that the joint survival function is 


regularly varying at -|-oo with index — 1/r/ (Ledford and Tawn, 1996), so that 

Pr(Z(si) > X, Z(sj) > x) = ^(x), X —>■+ 00 , (12) 

where r] G (0,1] is the coefficient of tail dependence and ^{x) is a slowly varying function i.e., 
{ax')/ ^ {x) — )• 1 as X —)■ -|-oo, for fixed a > 0. Considering .if as a constant, at extreme levels 
margins are negatively associated when r/ < 1/2, independent when r] = 1/2 and positively 
associated when 1/2 < 77 < 1. When r/ = 1 and .if(x) ^ 0 asymptotic dependence is obtained. 


We derive the asymptotic behavior of the joint survival function (12) for a pair of skew-normal 


margins. As our primary interest is in spatial applications, we focus on the joint upper tail of 
the skew-normal distribution when the variables are positively correlated or uncorrelated. 

Proposition 4. Let Z ~ 5 A/ 2 (^, ct); where a = (oi, 02 )"'" o-nd Cl is a correlation matrix with off- 
diagonal term uj G [0,1). The joint survivor function of the bivariate skew-normal distribution 


with unit Frechet margins behaves asymptotically as (12), where: 

1. when either oi, 02 > 0 , or uj > 0 and aj < 0 and a^-j > —uj~^aj for j = 1 , 2 , then 

r] = {1 -\- uj)/2, ^{x) = ^ (dvr log x)~^/; 

2. when a; > 0, aj < 0, and —ujOj < a^-j < —uj~^aj, for j = 1,2, then 


(a) If a^-j > —aj/aj then 

(b) If a^-j < —aj/dtj then 


■^(^) = (ff)(i-4) (4vrlogx)V2^ 1 ; 


r] = 




+ 03-7 + 7^ 


-1 


.if(x) = 7 ^— 

V J {oLj-. 




■ (dvr log 


jLj-u}){l-ojaj+aj{aj+a3-jaj){l-oj^)} 

3. when either ai, 02 < 0, or uj > 0, Uj < 0 and 0 < os-j < —uj aj for j = 1, 2, then 


V = 


1 


l-a ;2 




"3-7 


)(1—a;^)+l 2{a3_jaj(l—uj'^)—uj) 


_ 3 - 7 V--^ ^ -h 


'W 


OLi-jOLj 


-1 


jx(x) = 
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where Oj = ^/l + a*^ and a*, := a*r-i = 3 

^ \ ' 1 3 b} ^l+Q3_j(l-d^2) 

Proof in Appendix 1^.4 

As a result, when both marginal parameters are non-negative (case 1) then 1/2 < 77 < 1 , 
with r] = lj2 occurring when w = 0. As a consequence, as for the Gaussian distribution (for 
which a = 0 ) the marginal extremes are either positively associated or exactly independent. 
The marginal extremes are also completely dependent when a; = 1, regardless of the values of 
the slant parameters, a. When one marginal parameter is positive and one is negative (case 2) 
then r] > {1 + uj)j2. In this case the extreme marginals are also positively associated, but the 
dependence is greater than when the random variables are normally distributed. Finally, when 
both marginal parameters are negative (case 3), then 0 < 77 < 1/2 implying that the extreme 
marginals are negatively associated, although a; > 0. It should be noted that differently from 
the Gaussian case (a = 0) where a; > 0 implies a positive association, in this case it is not 
necessarily true. In summary, the degree of dependence in the upper tail of the skew-normal 
distribution ranges from negative to positive association and including independence. 


3 Spectral representation for the extremal-skew-t process 


The spectral representation of stationary max-stable processes with common unit Frechet mar¬ 


gins can be constructed using the fundamental procedures introduced by de Haan (1984) and 


Schlather (2002) (see also de Haan and Ferreira 2006, Gh. 9). This representation can be for¬ 


mulated in broader terms resulting in max-stable processes with iz-Frechet univariate marginal 


distributions, with n > 0 (Opitz, 2013). In order to state our result we rephrase the spectral 


representation so to also take into account non-stationary processes. 

Let {y(s)}sg§ be a non-stationary real-valued stochastic process with continuous sample 
path on S such that E{sup^ggy(s)} < 00 and = E[{y+(s)}'^] < 00 ,Vs G S for 17 > 0, 

where T'''(-) = max{y(-),0} denotes the positive part of Y. Let {Ri}i>i be the points of 
an inhomogeneous Poisson point process on (0, 00 ) with intensity n > 0, which are 

independent of Y. Define 


U{s) = max {RiY-'^(s)} / {m'^ , s G S, 
2 = 1 , 2 ,... 


(13) 


where Yi^Y2,... are iid copies of Y. Then [/ is a max-stable process with common Z7-Frechet 
univariate margins. In particular, for fixed s G S and x{s) > 0 we have 

E{y+(s)}^ 


Pr(f7(s) < x(s)) = exp 


x'2(s)?tt,+ (s) 


= exp{-l/x''(s)}, 
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and for fixed si,... ,Sd the finite dimensional distribution of U has exponent function 

■{Y+{sj)/xis,)y 


V (x(si),..., x(sd)) = IE ( max 

V J 


m+(sj) 


, x(sj) > 0, j = 1,... ,d (14) 


(de Haan and Ferreira, 2006, Ch. 9). 

In this construction, the impact of a non-stationary process Y(s) would be that the de¬ 
pendence structure of the max-stable process [/(s + h) depends on both the separation h and 
the location s G S, and would therefore itself be non-stationary. The below theorem derives a 


max-stable process I7(s) when T(s) is the skew-normal random field introduced in Section 2.2 


Theorem 1 (Extremal skew-t process). Let Y(s) he a skew-normal random field on s € E> 
with finite dimensional distribution 5A/’rf(i2, a, r), as defined in equation 0- Then the max- 


stable process U{s), given by (13), has v-Frechet univariate marginal distributions and exponent 
function 


V{xj,j e I) = Y^ 
i=i 


= > Xj’^^d-l 



1^ + 1 


1 — a;, 


- LOij \ fielj] ; 0°, a°, r°, iY:,v + l\ , (15) 




where xj = x{sj), 'ifd-i is a {d — 1)-dimensional non-central extended skew-t distribution (Defi¬ 
nition^ with correlation matrix Dy shape, extension and non-centrality parameters and 

K°, V + 1 degrees of freedom, / = d}, Ij = and tVij is the {i,j)-th element of O. 


Proof (and further details) in Appendix A.4 


We call the process U{s) with exponent function (15) an extremal skew-t process. 


Note that in Theorem when r = 0, and the slant function is such that q;(s) = 0 for all 


s G S, then the exponent function (15) becomes 


E(x,-,iGl) = 

16/ 


d-l 


1 — UJ. 




T 


- ] A ^ y \ ; + 1 


(16) 


This is the exponent function of the extremal-t process as discussed in Opitz (2013). 


If we assume r = 0 in (11), then the bivariate exponent function of the extremal skew-t 


process seen as a function of the separation h is equal to 

\ + 1 ) , '^iKxtih))-,a+{h),T+{h),u + l) 

Vix[s),x[s + n)i - + x-is + h) 

where T is a univariate extended skew-t distribution, b{ ) = 


12 
































Figure 2: Examples of univariate (fc = 1) non-stationary isotropic extremal coefficient functions Osih), 


for the extremal skew-t process over s G [0,1], using correlation function (10) where G [0,1], A = 1.5 
and 7 = 0.3. Slant functions are (left to right panels): 0 ( 3 ) = —1 — s + exp{sin(5s)}, q;(s) = 1 + 1.5s — 
exp{sin(8s)} and q;(s) = 2.25 sin(9s) cos(9s). Solid, dashed and dot-dashed lines represent the fixed 
locations s = 0.05,0.25 and 0.8 respectively. 


x*(h) = 

s'- ' x(s) 


x+{h) = 


x{s+h)rs{h) ’ 


a 


:{h) = a{s + h) , 


af{h) = a(s) 


T'sW — \YY^Yl{a{s) + a{s + h)uj{h)}, t+(/i) = + h) + a{s)uj{h)}, 


and 




Tsih) = 


ais) + a{s + 


u + l 


\/v 




a(s + h) + a{sMh)J + 1 


Clearly, as the dependence structure depends on both correlation function a;(/i) and the slant 
function q;(s), and therefore on the value of s G S, it is a non-stationary dependence structure. 
From the bivariate exponent function we can derive the non-stationary extremal coefficient 
function, using the relation 9s{h) = 1^(1,1), which gives 


Osih) = ^ib(TYh)y,a:{h),T:ih), u + l) + 'h(6(l/F,(h)); a+(h), r+(h), iy + 1). (17) 

Figurej^shows some examples of univariate {k = 1) non-stationary isotropic extremal coeffi¬ 
cient functions obtained from © using the power-exponential correlation function ( |10[ ). Each 
panel illustrates a different slant function a(s), with the line-types indicating the fixed location 
value of s G S. The extremal coefficient functions 9s{h) increase as the value of h increases, 
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Figure 3: Bivariate {k = 2) geometric anisotropic non-stationary extremal coefficient functions Osih), 
for the extremal skew-t process on s G [0,1]^, based on extremal coefficient function ( [TtI ) with A = 1.5 
and 7 = 0.3, where h = Rv, v = (ui,U 2 )^ G [—1,1]^ and i? is a 2 x 2 matrix whose diagonal elements 
are 2.5 and off-diagonal elements 1.5. Slant functions are 0 ( 3 ) = exp{sin(4si) sin(4s2) ~ S 1 S 2 — 1} (top 
panels) and a{s) = 2.25{sin(3si) cos(3si) -|- sin(3s2) cos(3s2)} (bottom), with s = (si,S 2 )^ G [0,1]^. Left 
to right, panels are based on fixing s = (0.2, 0.2)^, s = (0.4,0.4)^ and s = (0.85,0.85)^ (top panels) and 
s = (0.25,0.25)^, s = (0.25, 0.8)^ and s = (0.8, 0.8)^ (bottom). 

meaning that the dependence of extremes decreases with the distance. 9s{h) grows with differ¬ 
ent rates depending on the location s G S. Although the ergodicity and mixing properties of the 
process must be investigated, numerical results show that for some s, 6 s{h) —>■ 2 as |/i| —>■ -|-oo. 
By increasing the complexity of the slant function (e.g. centre and right panels) it is possible to 
construct extremal coefficient functions which exhibit stronger dependence for larger distances, 
h, compared to shorter distances. Similarly Figure illustrates examples of bivariate {k = 2) 
non-stationary geometric anisotropic extremal coefficient functions, 0 s{h), also obtained from 
Similar interpretations to the univariate case can be made (Figure [^, in addition to 
noting that the level of dependence is affected by the direction (from the origin). 
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4 Inference for extremal skew-t processes 


Parametric inference for the extremal-skew-t process can be performed in two ways. The first 


uses the marginal composite-likelihood approach (e.g. Padoan et ah, 2010 Davison and Gho- 


lamrezaee, 2012 Huser and Davison, 2013), since only marginal densities of dimension up to 


d = 4 are practically available (see the Supporting Information). 

Let 1 ? G 0 C W, p = 1,2,..., denote the vector of dependence parameters of the extremal- 
skew-t process. Consider a sample {xi,i = l,...,n) with Xi G of n iid replicates of the 
process observed over a finite number of points {sj,j G I) with Sj G S. For simplicity, it is 
assumed that the univariate marginal distributions are unit Frechet. The pairwise or triplewise 
(m = 2,3) log-composite-likelihood is defined by 

n 

£m{'&-,x) = '^ ^ log f{xi € E;!}), m = 2,3, 

i=\ E^Em 

where x = {xi,... ,Xn)~^ with Xi G and / is a marginal extremal-skew-t pdf associated 


with each member of a set of marginal events Em- See e.g. Varin et al. (2011) for a complete 
description of composite likelihood methods. 


A second approach is to use the approximate likelihood function introduced by [Coles arid 


Tawn (1994), which is constructed on the space of angular densities. The angular measure of the 


extremal-skew-t dependence model (15) places mass on the interior as well as on all the other 


subspaces of the simplex, such as the edges and the vertices. We derive some of these densities 


following the results in Coles and Tawn (1991). 


Let J be an index set that takes values in I = IP({1,..., d})\0, where 1P(/) is the power set 
of I. For any fixed d and all J G I, the sets 

= (tc G W : rcj = 0, if j ^ J; Wj > 0 if j G J) 

provide a partition of the d-dimensional simplex W into 2^^ — 1 subsets. Let fc = | J| be the size of 
J. Let hd^j denote the density that lies on the subspace Wd,j, which has k — 1 free parameters 
Wj such that j G J. When J = {1,..., d} the angular density in the interior of the simplex is 




h{'w) = 


v+l 


lA 


- cjip > , f G Ii 




w 


(d-l-l) 


n 


1 / u+l I K rn^ 




w gW (18) 


where ipd-i denotes the d — 1-dimensional skew-t density, Ij = d}\j and where the 

parameters and w° = are given in the proof to Theorem (Appendix 
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A.4). When J = {zi,..., i^} C {1,..., d}, the angular density for any x G is 


dd,. 


X. 


n 






k+1 


E 


Xi 


lim 


d’^V 


Xj^o, dxi^ ■ ■ ■ dxi^^ 


(x). 


(19) 


\jgj 

Thus, when J = {j} for any j G {1,... ,d} then Wdj is a vertex ej of the simplex and the 


density is a point mass, denoted hdj = H{{ej}). In this case (19) reduces to 

hd,j = ^d-i { ( ~J ^ “T ^ 1 


, 


( 20 ) 


where denotes the d — 1-dimensional skew-t distribution with parameters again given in 

the proof to Theorem (Appendix |A.4[ ) . 

Computations of all 2“^ — 1 densities that lie on the edges and vertices of the simplex are 
available for d = 3. In this case, the angular densities on the interior and vertices of the simplex 


can be deduced from (18) and (20). For all i,j G J = {1,2,3}, with i ^ j, the angular density 
on the edges of for w G is given by 


/i3,{i,i}(w^) = 




+ 1) 
4'(f„; u + l) 


^2 


{Viiu, v),y°2{u, u)}"^ ; 0°°, u + 2 


1 j d6° ( d6° {u + 2)6° 

X — < ^- 7 ^-h ' 


1 

w 1 1 dwudwv dt/;.y V dwu zz -|- 1 -|- b°\ w i 




I (7+2 K,v(^u,k + + ^) 

i-Si3i,2i ix + i + Klf'^ 




X T 


V^ + 3 (u, w)I2°°[i^] - zl (n, ?z)0°°[^ 2]} 




z]u-\-3 


( 21 ) 


\ Y 1 + zf{u, '(;)J det(0°°) y 

U + 2 X{U, V)fu + ^*^[ 1 , 3 ] {i' + 1 ) 


+ 'tp{y2iu,v)-,i^ + 2} 


1 - n 




xT < 


\47T3 {^l°(w,^^)^^°°[2,2] - ^2(^>^)^n°[l,2]} 


^«?[ 2 , 2 ]{^ + 1 + K%} + Z 2 {u, det(II° 




:; -|- 3 


where for all u,v G J, with tt / u, and k ^ {z, j j. 


yi[U,V) = ^=1,2, 2{(U,U) 


u,[£,i] 


CiiT} — (jJti 


^ + 1 Of \ - r)° 12 + 1 

-^, 2:2 [u, v)=Tu- J 1 _ 3 ], 6„_^ = /- 




1 - ^Iv 


1 + 


UJ'ii 
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= 


-6J 1 


= diag(i7: 


oo\ _ O® 


, c (a„yr^<,a,yr^^) , n°°= 

“ ^u [- 11 ]^« [1 -!]■ Components of and 0°° are respectively 


given by ^ '^’ Appendix A.4 for further details. When, 


T = 0 and q;(s) = 0, then the densities (18), (20) and (21) reduce to the densities of the 


extremal-t dependence model. A graphical illustration that shows the difference between the 
two dependence models is provided in the Supporting Information. 

Therefore, for d = 3 the estimation of dependence parameters can be based on the following 
approach. Let {{ri,Wi) : i = l,...,n} be the set of observations, where r* = 

Wi = Xi/ri, with Xi = {xij)j^j, are pseudo-polar radial and angular components. Then the 
approximate log-likelihood is defined by 


i{'&-,w)= ^ logh{wi-,d), 


( 22 ) 


2=1,...,n: 

ri>ro 


where w = (rci,..., Wn) , for some radial threshold tq > 0, and where h is the angular density 


function of the extremal-skew-t dependence model. The components of the sum in (22) comprise 


the three types of angular densities lying on the interior, edges and vertices of the simplex. 
Whether an angular component belongs either to the interior, an edge or a vertex of the simplex, 
producing the associated density, is determined according the following criterion. We select a 
threshold c G [0,0.1] and we construct the following partitions for an arbitrary observation 
Wi = {wij,Wi^k,Wi^i), i = 1,... ,n. Set w = Wi for simplicity. When Cj = {wj > 1 — c; j £ 1} 
then an observation belongs to vertex gj. WhenTj_fc = {wj,Wk < l—c,wi < c,Wj > l—2wk,Wk > 
1 — 2wj;j G I,k G Ij, I G /j\{A:}}, then an observation belongs to edge between the jth and feth 
components. When X = {wj > c;j G 1} then an observation belongs to the interior (see the 
Supporting Information for more details). The components of the angular density h{w) then 
require rescaling so that they satisfy the constraints of valid angular densities - namely that 
they integrate to the number of components of w (3 in the trivariate case) - while also respecting 
the partition of W implied by c. Without this rescaling then the likelihood of e.g. the model 
that places mass on all subsets of the simplex is not comparable with that of models that places 
mass only on subsets of the simplex. Specifically 

/ h{w)dw = Kc^ V fc}(u;)du;-F iLx / /i3^|i^2,3}(^^)dw^ = 3, 

Jw Jc, 2 

k=j+l,3 
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where 


^4 2 h3^{j^k}Hdw 


Id fd ^3,{i,2,3}Mdw 


Kx = 


rl — 2c rl — 2c i i' \A ^ 

Jc Jc ^3,{l,2,3}Mdw 


c\/3(l - 2c) 

and h^yj, and /i 3 ,{i, 2 , 3 }(^) are defined above. Note that for j,k ^ I with j / k, we 

have that the bivariate case {d = 2), the appropriate modification 

only considers the mass on the vertices and interior. 

We now illustrate the ability of the approximate likelihood in estimating the extremal de¬ 
pendence parameters in the bivariate and trivariate cases. We generate 500 replicate datasets 
of sizes 5000 (bivariate) and 1000 (trivariate), with parameters '&2 = (w, i^) = (0.6,1.5) and 
??3 = (a;i^ 2 ; wi^ 3 , u; 2 , 3 , z^) = (0.6,0.8,0.7,1). Each dataset is transformed to pseudo-polar coordi¬ 
nates and the 100 observations with the largest radial component are retained. Parameters are 
estimated through the profile likelihood where the dependence parameter oo is the parameter 
of interest and the degree of freedom u is considered as a nuisance parameter. Parameters are 
estimated for different values of the threshold c = 0,0.02,0.04,0.06,0.08,0.1. In order to com¬ 
pare likelihoods for different values of c, the likelihood functions are evaluated using those data 
points considered to belong to the interior of the simplex, multiplied by the mass at the corners 
and/or edges in proportion to their rescaling constants. 
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Bb " 


0 0.02 0.04 0.06 0.08 0.1 


f 9 S 4 s 


0 0.02 0.04 0.06 0.08 0.1 


0.02 0.04 0.06 0.08 0.1 


Figure 4: Left to right: Boxplots of the estimates of the dependence parameter ut, the degree of freedom 
v and the associated maximum of the likelihood function based on the rescaled angular density, when 
c = 0,0.02,0.04,0.06,0.08 and 0.1. Boxplots are constructed from 500 replicate datasets of size 5000. 
Horizontal lines indicate the true values uj = 0.6 and v = 1.5. 


Figures]^ and [^provide (left to right) boxplots of the resulting estimates of the dependence 
parameter(s) ca, the degree of freedom u and of the likelihood function for increasing values of c, 
for the 500 replicate datasets for both bivariate and trivariate cases. The true parameter values 
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are indicated by the horizontal lines. 
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Figure 5: Left to right: Boxplots of the estimates of the dependence parameter uj = (wi, 2 , ^ 1 ^ 3 , W 2 . 3 )) the 
degree of freedom v and the associated maximum of the likelihood function based on the rescaled angular 
density, when c = 0, 0.02, 0.04,0.06,0.08 and 0.1. Boxplots are constructed from 500 replicate datasets of 
size 1000. Horizontal lines indicate the true values u;i ^2 = 0.6, wi ,3 = 0.7, a; 2,3 = 0.7 and v = 1. 


In the rightmost panel of each Figure, the largest values of the log-likelihood are globally 
obtained for c = 0.02, for which the most accurate estimates of cu and v are also obtained. 
Conditional on c = 0.02 the mean estimates are Co = 0.55 and u = 1.79 in the bivariate case 
and Co = (0.62,0.80,0.71) and i) = 1.27 in the trivariate case. Note that the degree of freedom v 
appears to be slightly overestimated, and appears to be better estimated for slightly larger values 
of c. Overall this procedure appears capable of efficiently estimating the model parameters. Note 
that increased precision of estimates can be obtained by considering a denser range of threshold 
values c. 

An independent study comparing the efficiency of the maximum pairwise and triplewise 
composite likelihood estimators is provided in the Supporting Information. 


5 Application to wind speed data 

We illustrate the use of the extremal skew-t process using wind speed data (the weekly maximum 
wind speed in km/h), collected from 4 monitoring stations across Oklahoma, USA, over the 
March-May period during 1996-2012, as part of a larger dataset of 99 stations. An analysis 
establishing the significant marginal, station-specific skewness of these data is presented in the 
Supporting Information. Here, we focus on the dependence structure between stations, where 
for simplicity the data is marginally transformed to unit Frechet distributions. Only extremal-t 
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and extremal skew-t models are considered, and parameter estimation is performed via pairwise 
composite likelihoods as detailed at the beginning of Section 

Model comparison is performed through the composite likelihood information criterion (CLIC; 


Varin et al., 2011) given by 


CLIC = -2 




where is the maximum composite likelihood estimate of i?, £ 2 ('&',x) is the maximised pair¬ 
wise composite likelihood, and J and H are estimates of J{'&) = Var[/(V1'2(??; ?7)) and H{'&) = 
Kif(—V‘^£ 2 {'&', U))y the variability and sensibility (hessian) matrices, where U is a bivariate ran¬ 
dom vector with extremal skew-t distribution. 


Stations 

Model 

LO 

a 

0 

CLIC 

(CLOU,CLAY,SALL) 

ex-f 

(0.67,0.57,0.69) 

- 

2.89 

5395.73 


ex-skew-t 

(0.42,0.74,0.52) 

(-0.80,2.88,-0.23) 

se: (0.04,0.14,0.03) 

2.06 

5385.07 

(CLOU,CLAY,PAUL) 

ex-f 

(0.59,0.50,0.69) 

- 

2.53 

5503.54 


ex-skew-t 

(0.45,0.29,0.65) 

(-0.68,21.07, 23.41) 

se: (0.05,0.97,1.09) 

2.16 

5496.90 

(CLAY,SALL,PAUL) 

ex-t 

(0.65,0.61,0.53) 

- 

1.55 

5086.13 


ex-skew-t 

(0.56,0.51,0.39) 

(3.55,2.36,8.49) 

se: (0.17,0.15,0.63) 

1.29 

5075.87 

(CLOU,SALL,PAUL) 

ex-t 

(0.37,0.40,0.42) 

- 

1.88 

5428.04 


ex-skew-t 

(0.29,0.30,0.37) 

(-0.14,1.04,34.70) 

se: (0.03,0.02,3.49) 

2.11 

5419.27 


Table 1: Pairwise composite likelihood estimates -d = and £) = (d),Q:, D) of the extremal-t (ext-t) 

and extremal skew-t (ex-skew-t) models respectively, for all possible triplets of the four locations CLOU, 
CLAY, PAUL and SALL. Standard errors (se) are shown for a only. 

Table presents the pairwise composite likelihood estimates of w = (a;i 2 , wis, ti; 23 ), a = 
{ai,a 2 ,oiz) and v for the extremal-t and extremal skew-t models, obtained for all triplewise 
combinations of the four locations CLOU, CLAY, PAUL and SALL. For each triple the extremal 
skew-t model achieves a lower CLIC score than the extremal-t model, indicating its greater 
suitability. Moreover the standard errors of the estimated slant parameters d, clearly indicate 
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that these parameters are non-zero, strengthening the argument of a significantly better fit from 
the extremal skew-t model 

For each location triple {X, Y, Z) we can also evaluate the conditional probability of ex¬ 
ceeding some fixed threshold {x,y,z) using each parametric model. Table presents estimated 
probabilities of the two cases Pr(X > x\Y > y,Z > z) and Pr(X > x,Y > y\Z > z), along 
with the associated empirical probabilities and their 95% confidence intervals (Cl) for a range 
of thresholds. For these specific thresholds, the extremal skew-t model provides estimates of the 
conditional probabilities that fall within the 95% empirical CL However, four probabilities esti¬ 
mated with the extremal-t model are not consistent with the empirical CL This indicates that 
the additional flexibility of the extremal skew-t model allows it to more accurately characterise 
the dependence structure evident in the observed data. 


Threshold 

Extremal-t 

Extremal skew-t 

Empirical (95% Cl) 

Y|y,Z 

0.2587 

0.3268 

0.3752 

0.2686 

0.2737 

0.3305 

0.3356 

0.3150 

0.3333(0.2706,0.3960) 

0.2973(0.2356,0.3590) 

0.2857(0.2247,0.3467) 

0.3333(0.2706,0.3960) 

X,Y\Z {ql%,ql\,qll) 

0.1196 

0.1236 

0.0896 

0.1038 

0.0789 

0.0776 

0.1048 

0.1071 

0.0781 (0.0420,0.1142) 

0.0938 (0.0546,0.1330) 

0.0938(0.0550,0.1326) 

0.0769(0.0415,0.1123) 


Table 2: Extremal-t and extremal skew-t conditional probabilities of exceeding particular fixed thresholds 
of the form Pr(2f > x\Y > y,Z > z) and Pr(2f > x,Y > y\Z > z), along with empirical estimates. The 
windspeed thresholds {x,y,z) are constructed from the marginal quantiles = ((ZcOj' i'CA: 9 saj 9pa) = 
(18.04,20.33,24.18,23.61) and = (gco> 9 ca> 9sa> 9pa) = (22.11,24.33,29.05,28.26) at each location. 


Finally, Figure provides examples of univariate (top panels) and bivariate (bottom) con¬ 
ditional return levels for each triple of sites. The return levels are computed conditionally on 
the wind at the remaining station(s) being higher than their upper 70% marginal quantile. For 
the univariate conditional return levels (top panels), both the extremal-t and extremal skew-t 
model fits are strongly influenced by the windspeed outlier of ~ 40 km/h observed at CLAY 
station (centre two panels). This phenomenon, whereby the far tails of extremal model fits can 


be dominated by a single extreme outlier, is not uncommon in practice (e.g. Coles et ah, 2003). 
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Figure 6: Univariate (top row) and bivariate (bottom) conditional return levels for the triples (left- 
to-right); (CLOU, CLAY, SALL), (CLOU, CLAY, PAUL), (CLAY, SALL, PAUL) and (CLOU, SALL, 
PAUL). Red and blue lines respectively indicate return levels calculated from extremal-t and extremal 
skew-t models. Points indicate the empirical observations and the black dashed lines their 95% confidence 
interval. 

Being the more flexible model, the extremal skew-t model is better able to follow this extreme 
outlier compared to the extremal t. When the outlier is not present (in the two outer panels), 
the extremal skew-t model provides a better visual fit to the observed data and spends more 
time within the empirical confidence intervals, indicating a superior model fit. 

The primary differences in the bivariate conditional return levels (bottom panels. Figure]^ 
are the possibility of asymmetric contour levels under the extremal skew-t model (blue line) 
in contrast with symmetric contours under the extremal-t model (red line). The difference is 
more noticeable in the leftmost and rightmost panel. The leftmost panel indicates lower return 
levels for the extremal skew-t model, which occurs because (CLOU, SALL) have negative slant 
parameters (Table top row) and so the joint tail is shorter than that of the extremal t. 
Conversely, the rightmost panel exhibits larger return levels for the extremal skew-t model, as 
a result of the small negative and very large slant parameters for (CLOU, PAUL) (Table 
bottom row), and so the joint tail is longer than that of the extremal-t. The differences in the 
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centre two panels are less pronounced. For the second panel, the slant parameters of (CLOU, 
PAUL) similarly take a large positive and a small negative value (Table row 2). However, as 
the parameter for CLAY is also a large positive value this means that there is little difference 
between the joint tails of the two models. Finally, for the third panel, the slant parameters of 
(CLAY, PAUL, SALL) are relatively small and positive (Table row 3) and so there is little 
difference between the joint tails of the two models. 

In summary, for these wind speed data, the more flexible extremal skew-t model is demonstra¬ 
bly superior to the extremal-t model in describing the extremes of both the univariate marginal 
distributions, and the extremal dependence between locations. 


6 Discussion 

Appropriate modelling of extremal dependence is critical for producing realistic and precise 
estimates of future extreme events. In practice this is a hugely challenging task, as extremes 
in different application areas may exhibit different types of dependence structures, asymptotic 
dependence levels, exchangeability, and stationary or non-stationary behaviour. 

Working with families of skew-normal distributions and processes we have derived flexible 
new classes of extremal dependence models. Their flexibility arises as they include a wide range 
of dependence structures, while also incorporating several previously developed and popular 
models, such as the stationary extremal-t process and its sub-processes, as special cases. These 
include dependence structures that are asymptotically independent, which is useful for describing 
the dependence of variables that are not exchangeable, and a wide class of non-stationary, 
asymptotically dependent models, suitable for the modelling of spatial extremes. 

In terms of future development, semi-parametric estimation methods would provide powerful 
techniques to fully take advantage of the flexibility offered by non-stationary max-stable models. 
Such methods can be computationally demanding, however. An interesting further direction 
would be to design simple and interpretable families of covariance functions for skew-normal 
processes for which it is then possible to derive max-stable dependence models that are useful 
in practical applications. 

The code used to perform the simulations studies and real data analysis in Section]^ andas 
well as in the Supporting Information, is available in the R (?) package ExtremalDep ([Berang^ 


et al. 2015) available at https ://r-forge. r-project. org/R/?group_id=1998. 
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Supporting Information 


Additional information for this article is available online. 
Description: additional derivations, simulations and figures. 
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A Appendix A: Proofs 


A.l Proof of Proposition 


Items (l)-(3) are easily derived following the proof of Propositions (l)-(4) of Arellano-Valle and 
Genton (|2010) and taking into account the next result. 


Lemma 1. Let Y = {Y^ ,¥ 2 )^ ~ where Vi G M and 1^2 G ^ with the corre¬ 

sponding partition of the parameters {pL, D, v) and k = (ki, O"'")''" with ki G M. Then, 


(V 1 IV 2 = 2 / 2 ) ~ 7'(/ii.2, ki- 2 , 2 ^ 1 - 2 ), 2/2 G ^ 


where ^1.2 = /ri + 1212^22^(1/2 - M 2 ), D1.2 = C2f2ii.2, C2 = + ^2), ^2 = 

^2 ^( 2/2 ~ M2)/P2, W}2 = diag{Q.22)^^‘^, Pll-2 = Pll ~ Dl2D22^^21, ^1-2 = C 2 ^ ^1-2 = n + d — 1. 

Proof of Lemma The marginal density of I 2 is equal to 

/d(2/2)=/ - 

Jo 

namely it is a (d — l)-dimensional central t pdf. The joint density of Y is equal to 


v/2-l^-v 



2 ^ xU - 1)/2 


dn = V’d- 1 ( 2 / 2 ; M 2 , P 22 ,i^), 


fY2{y2)fYi\Y2=y2iy^) 



□ 
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A.2 Proof of Proposition 

Let Z ~ ST{a, r, k, u). Then 1 — 'L(x; a, r, v) k. a, r, z/) as x —>■ +oo, for any u > 1, 

where 

r{(z/ + l)/2}'h(a\/z7+T; v + 1) 


^[x\ a, r, K, v) = 


_|_ q, 2 . -|- q, 2^ 


O ”1” 


is a slowly varying function (e.g de Haan and Ferreira, 2006 Appendix B). From Corollary 


1.2.4 in de Haan and Ferreira (2006), it follows that the normalisation constants are an = 


— 1/n; a, r, K, z/), where T"*” is the inverse function of T, and bn = 0, and therefore 
On = {'n'.Sf{a,T,K,u)}^/’', where J!f{a,T,K,u) = .if (oo; a, r, zt, z/). Applying Theorem 1.2.1 in 
de Haan and Ferreiraj ( |2006 ) we obtain that Mn/an => U, where U has zz-Frechet univariate 
marginal distributions. 

Let Z ~ STd{^,a,T, K,iy). For any j G d} consider the partition Z = {Zj,Zj,)'^, 

where Ij = {l,...,d}\j and Zj = Zyy, and the respective partition of (H,a). Define an = 
(onp,..., an,d), where = {nif(a*, r*, and a* = t* = and k* = are 

the marginal parameters ([^ under Proposition [T]Q. From Theorem 6.1.1 and Corollary 6.1.3 in 
de Haan and Ferreira ( |2006 ), Mn/an => U, where the distribution of U is G{x) = exp{—H(x)} 
with V{x) = lim„^+oo n{l - Pr(Zi < OnpXi, ...,Zd< an,dXd)} for all x = {xi,.. .,Xd)~^ G M+. 


Applying the conditional tail dependence function framework of Nikoloulopoulos et al. (2009) it 
follows that 

d 

V{xj,i G I) = lim ^x“''Pr(Zi < an,iXi,i G Ij\Zj = anjxj). 
i=i 

From the conditional distribution in Proposition [T]Q we have that 

Zi an.-jX 




{inja - 


,i ^ Ij 


\Zj — anjXj > ~ 


'STd—l f Pj , Olj , TnJ , l^n,j j ^ T 1 j , 


for j G ... 1,..., d, where = diag(D/./^..j)V2, - 

{o.njXj') \j{y T 1), Fnj — A ^j')^n,jXj ~\~ x\/(/^ 


n,3 


1 /2 

and Kn,j = i^/Cn j ■ Now, for any j G {1,..., d} and all i G Ij 

(x+/x+ -Wij)(z/ + 1)1/2 


as n —)■ + 00 , 


{(1 

where ojij is the (i,j)-th element of D, xl = {a*,T*, k*,!^) and Tnj —)■ tI = {Clji.ai- + 

aj){v + 1)^/^, and Kn,j —)• 0 as n —)• +oo. As a consequence 


Vix„j€l) = Y^ 


= > Xj’^^d-l 


i=i 



zz + 1 




— UJ. 




,i £ Ij 
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A.3 Proof of Proposition 


Recall that if Z ~ 5A/2(P,a), then Zj ~ SJ\f{a*) and ~ SJ\f{aj. 3 -j) for j = 1,2 (e.g. 

Azzalini (2013 Ch. 2) or Proposition!^, where 


= 


Oj + uja^—j 




a 


i-3—i — (Xjy/l 




Define Xj{u) = $■*“(! — u;a*j), for any u G [0,1], where <!>■*“(•; a*) is the inverse of the marginal 
distribution function ‘h(-; ap, j = 1, 2. The asymptotic behaviour of Xj{u) as tt —>■ 0 is 


Xj{u) = 


x{u), 


if a* > 0 


(23) 


x{u)/aj - {2\og{l/u)} log(-v/7rap, if a* < 0 
for j = 1,2, where oij = {1+a*^}^/^ and x{u) ^ {2 log(l/tt)}^/^ —{2 log(l/n)}“^/^{loglog(l/u) + 


log(2T/^)} (Padoan, 2011). The limiting behaviour of the joint survivor function of the bivariate 
skew-normal distribution is described by 


p{u) = Pr{Zi > xi(u), Z 2 > X 2 (u)}, u —)■ 0. 


(24) 


For case (a), when ai, 02 > 0, then xi{u) = X 2 {u) = x{u), and the joint upper tail (24) behaves 
as 


P{u)=f |l - $ ) !> (/)(u;a2)du 

Jx{u) I 


Vi — 






x{u) 


10 


(j) 2 ix{u),x{u) + t/x{u)] P, a] 
x{u){l — oj) — 0 jt/x{u) 


dt 


^-X^{u)/{l+uj) 


7r(l — io)x‘^{u) I Jq 

g-a;2(n)/(l+a;)(i _^^) 

7r(l — oj)x{u)‘^ 




—x'^{u){ai+a 2 )'^/2 roc \ 

J: _ / g-t{l/(l+‘*^)+«2(ai+«2)}^^ j 

y/^{ai + a 2 )x{u) Jo J 


1 - 


^-x^{u){ai+a2)^/2 


V^{ai -I- q:2){1 -I- 02 ( 0:1 + 02)(1 + 0 j)}x{u) j 


(25) 


as u —>■ 0. The first approximation is obtained by using 1 — <h(x;a) ss (/>(x;o)/x as x —>■ -|-oo, 


when o > 0 (Padoan, 2011). The second approximation uses 1 — <I>(x) ss 4>{x)/x as x —)■ -|-oo 


(Feller, 1968). Let Xj = {—1/log‘l>(Zj; o^}, j = 1,2. Substituting x(u) into (25) substituting 
and using the approximation 1 — Pr(Aj >x) ~ 1/xasx—>■ 00 , j = 1, 2, we obtain that (24) with 


common unit Frechet margins behaves asymptotically as .jZ’(x) x 2 /( 1 +^*^)^ ^s x —)• -|-oo, where 

(26) 


2(1-I-a;)(47rlogx) / (47rlogx)^^"^"'““2^^ ^^/^x (“i+" 2 )^ 

.if(x) = -:- I 1 — 


1 — UJ 


(oi -|- 02){1 + 02(01 02)(1 -|- cu)} 
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As the second term in the parentheses in ( 26 ) is then the quantity inside the 


parentheses —)• 1 rapidly as x —?• oo, and so ^{x) is well approximated by the first term in ( 26 ). 
When q;2 < 0 and ai > —a2loJ-, then 0^,02 >0 and we obtain the same outcome. 

For case (b), when 02 < 0 and —u, 02 < ai < —uj~^a2, then > 0 and < 0 and hence 
xi{u) = x{u) and X2{u) ^ x{u)/a2 as n —)■ 0 . When ai > —q; 2«2, then following a similar 


derivation to those in ( 25 ), we obtain that 
0:2(1 ~ “ ^02)“^ 


p{u) 


7 r(a 2 — uj)x‘^{u) 


exp 


x^{u) \ I — up' + (02 — 
(l-a;2)di 


as u 


0 . 


Similarly, when oi < —0202, and noting that <h(x) « —(^{—x)lx as x —>■ — 00 , then 


( N -« 2{1 “+ a 2 (a 2 + aia 2 )(l - ^ —2 

7 r(d 2 - a;)(l - w 2 )-i(oi + 02 /d 2 )x 3 (n) 

For case (c), when 02 < 0 and 0 < oi < —a;o2, then 0^,02 < 0 and hence xi(n) ss x(n)/di and 
X2(u) « x(u)/a2 as tt —)■ 0 . Then as u —>• 0 we have 

—d 2 '^^df(l — w^)(d 2 — ujai)~^(aia2 + 0201)“^ 


M{. 


( 1 — 




n. 


as ti —>■ 0. 


p{u) 


X exp 


7 r{l - uja2 + 02(02 + oi02/oi)(l - u}‘^)}x^{u) 
x‘^{u) 


2(1 -a;2) 


of(l —a;^) + l o|(l —a;^) + l 2(0102(1 — — cu) 

^ ^(2 I 


^,2 


o 


o 


0102 


u 


0 . 


When 0i,02 < 0 and UJ2 ^02 < oi < 0 the same argument holds. Finally, interchanging oi with 
02 produces the same results but substituting aj and aj with a^-j and 03-j respectively, for 
j = l,2. 


A.4 Proof of Theorem [T] 

Let Y(s) be a skew-normal process with finite dimensional distribution 5 A/rf(il, o, r). For any 
j £ I = {l,...,d} consider the partition Y = ( 1 ^-,where Ij = I\j, Yj = Y^jj = Y{sj) 
and Y/^, = {Yi,i G Ij)'^ , and the respective partition of (li,o). The exponent function HI is 


V{xj,j G I) = E 


max ' 




TTL- 


{yj/xjf 


= I max< ^ ,0} (f)diy,^;a,T)dy, 

3 mj 


where Xj = x{sj), yj = y{sj) and mJ = m'^{sj). Then 

^ 1 /■“ 


Y(x,-,jG/) = Y, 


i=i 


Vl 
Jo \Xj 


4 >d{y;^;a,T)dyi^dyj, ( 27 ) 
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where xj. = {xi,i G Ij)' and yi- = {yi,i G Ij)' . As 1^- ~ SJ\f{a*,T*), where a* = and 
3 ‘{j} 

'^t ^ J '^j )dyj = 


Tj = T'f-^ are the marginal parameters derived from Proposition jljQ , then 


1 


Vj 4 >{yj)^{a*yj +T*)dyj 


<f>{r;(l + af)-V2}yo ^3 

2('^-2)/2r{(z. + l)/2}^'(a*v^i7Tl; -t*, iy + 1) 


V 7 r$[r{l + < 3 ^( 0 :)} V 2 ] 

by observing that rt{l + = 7-{i -g QQ{a)}~^/‘^. 

For j = 1 ,... ,d define x° = Xj{rti^Y/'^ and m'^ = /< 1 >[t {1 + where 

(7r)^/22('^-2)/2p|(’jy _l_ \y 2 }'^{a*^/v + 1 ; + !)■ Then, for any j = 1 ,..., d 

ryjxj./xj 

(j)d{y,^,a,T)dyi.dyj 

(j)d{y, n)$(a’^y + T)dyi.dyj 

4>d-i{yij - Vj^jjj ; + T)dyi^dyj 


m: = 



4>{yj)^d {vj'^^T) '^y: 


'Jj 


where 


yi = 




2^a*+r; 


with = diag( 0 /^py)^/ 2 ^ = ^ijij - y i%b > + Tj = 


yjiaj+n^^-njl ai )+T 


i - 'J - |i+gn," .(«,))■'= 




{ 






^}.T . T . . diJ T \ OLT - \ 


{l+Qn,. 


T 




/ 


. J . .dU) T \ O'T 

wViprp 0° = I,1~3 Or T -i.O^ anri j j bb'-^ J _ ~'J 

bb'i bb'i {3+Qnj_j..AaiAV^^ {3+Qn°(‘^ijij-j‘^ij'>y^^' 

_ _ J 3 3 J j J J J 

Applying Dutt’s (Dutt, p!973 ) probability integrals we obtain 



Hyj)^d{y]-,^T)^y 3 ^ 

+ 1; -T*,u + 1) 
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This is recognised as the form of a (d— l)-dimensional non-central extended skew-t distribution 
with v + 1 degrees of freedom (Jamalizadeh et al. 2009), from which Vj can be expressed as 


u = W. 



u + 1 


x7 




- w, 






for j = where a° = t° = {Qji-aj. + aj){v + 1)^/^ and k° = -{1 

Qui I Substituting the expression for Vj into (27) then gives the required the 


exponent function. 
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B Supplementary material for ‘Models for extremal dependence 
derived from skew-symmetric families’ by B. Beranger, S. A. 
Padoan and S. A. Sisson 

This document/appendix contains technical details for deriving the bivariate, trivariate and 
quadrivariate densities of the extremal-skew-t model described in the paper, some graphical 
illustration and simulation results for the extremal-t process. 


B.l Plots of the angular density of the extremal-skew-t model 



Figure 7: Trivariate extremal skew-t angular densities with v = i degrees of freedom. Correlation 
coefficients are uj = (0.6, 0.8, 0.7)^ for the top row and w = (0.7,0.7, 0.7)^ for the bottom. From left to 
right the skewness parameters are a = (0,0,0)^, a = (—3,—3, 7)^ and a = (7,—10,3)^. In all cases 
r = 0 for simplicity. 


Figure [^illustrates some examples of the flexibility of the trivariate extremal-skew-t depen¬ 
dence structure. Here we write the correlation coefficients as a; = (wi^ 2 ) ‘^ 2 , 3 )"'" and the slant 

parameters as a = (ai, 2 , 01 , 3 , 02 ,s)"*", and assume that v = 3 and r = 0 for simplicity. 

The plots in the left column have a = (0, 0, 0)^ and so correspond to the extremal-t angular 
measure. The density in the top-left panel, obtained with w = (0.6,0.8, 0.7)"'", has mass concen¬ 
trations mainly on the edge that links the first and the third variable, since they are the most 
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dependent = 0.8). Some mass is also placed on the corners of the second variable, indicat¬ 
ing that this is less dependent on the others {wi ^2 = 0.6 and W 2,3 = 0.7), and on the middle of 
the simplex, because a low degree of freedom {u = 3) pushes mass towards the centre of the sim¬ 
plex. The top-middle and top-right panels are extremal skew-t angular densities obtained with 
a = (—3, —3, T)"*" and a = (7, —10, 3)"'" respectively. Here the impact of the slant parameters is 
to increase the levels of dependence - indeed the mass is clearly pushed towards the centre of the 
simplex. In the middle panel dependence between the second and third variables has increased, 
while in the right panel all variables are strongly dependent with a greater dependence of the 
second variable on the others. 

The bottom row in Figure illustrates the spectral densities with correlation coefficients 
oj = (0.7,0.7, 0.7)"'^. The bottom-left panel is the standard extremal-t dependence (with a = 
(0,0,0)"'^), which has a symmetric density with mass concentrated mainly in the centre of the 
simplex and on the vertices. The bottom-middle and bottom-right panels show extremal skew-t 
densities, obtained with a = (—3, —3,7)"'" and a = (7, —10, 3)"'" respectively. In this case the 
impact of the slant parameters is to decrease the dependence - here the mass is pushed towards 
the edges of the simplex. In the middle panel the first and second variables have become 
less dependent from the third variable, more so than between each other. In the right panel 
the first and third variables are less dependent on the second. These examples illustrate the 
great flexibility of the extremal skew-t model in capturing a wide range of extremal dependence 
behaviour above and beyond that of the standard extremal t model. 

B.2 Display of the partitions of the three-dimensional simplex 

Figure [^displays the partitions of the three-dimensional simplex into three vertices (grey shad¬ 
ing) , edges (line shading) and the interior (no shading). Observations where angular components 
fall into such areas are considered to belong to the corresponding subset of the simplex (vertex, 
edge or interior). 

For example, when > 1 — c (on the left of the green dashed line indicating the 1 — c 
level for W 3 ), then w = {wi,W 2 ,W 3 ) is in the corner associated with the third component, which 
corresponds to the grey shaded triangle on the bottom left of the simplex. Similarly, if both 
wi and W 2 are less than 1 — c (i.e. to the left of the blue dashed line indicating the 1 — c level 
of wi and below the red dashed line indicating the 1 — c level of W 2 ), such that tci > 1 — 2 w 2 
and W 2 > 1 — 2wi (i.e. to the right of the black dashed line bisecting the corner of the second 
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component and above the black dashed line bisecting the corner of the first component) and if 
W 3 < c (to the right of the green dashed line indicating the 1 —c level of W 3 ) then w = {wi,W 2 ,W 3 ) 
is on the edge between the first and second component. This is indicated by the line-shaded 
area on the right hand side of the simplex. Finally if wi,W 2 ,W 3 > c (i.e. to the right of the 
blue dashed line, above the red dashed line and to the left of the green dashed line, respectively 
indicating the c levels of wi,W 2 and W 3 ) then w = (rci, u) 2 , rcs) is in the interior of the simplex, 
represented by the white triangle in the centre of the simplex. 


B.3 Computation of d-dimensional extremal-skew-t density for d = 2,3,4. 

For clarity of exposition we focus on the finite dimensional distribution of the extremal-t process, 
denoted by G. We initially assume that a = 0 and r = 0 in (15) of the main paper (focusing on 
(16)), and relax this assumption later. For brevity the exponent function is written as 


V (xj ,j e I) = ’ Tj = ^d-i {uj ; 0 °, 1 / + 1 ) 

16 / 




where I = {1,..., d}, uj = 
successive differentiations the 2-dimensional density {d = 2) is 


T 


and where Ij = I\{j}- By 


f{x) = {-Vi 2 + ViV 2 )G{x), 



Figure 8; Partitions of the three-dimensional simplex 
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the 3-dimensional density {d = 3) is 


f{x) = (-^123 + V^lV^23 + h^2l^l3 + V^3hi2 " ViV 2 V:,)G {x), X G 


and the 4-dimensional density {d = 4) is 


f{x) — ( —Vi 234 + ^1^234 + 1^2^134 + 1^3^124 + hi 2 V 34 -|- V 13 V 24 + ^14^23 

- ^^^ 2^34 - V1F3V24 - V1V4V23 - V2V3VU - V2V4V13 - V3V4V12 

+ ViV2V3V4)G(x), X G 

where := ^ derivatives of the exponent function are given by 

d™“^r,4 


d 


V- • 

4 dxi, •••dxi 




- y X, 


-2 


dxi^ ■ ■ ■ dx,:^ dx,;«,, • • • dx,: 


-'•i -‘m -li - 

In particular, when m = d \t follows that {ii,..., ij} = {1,..., d} and that 

d 


Vi...d = -{yxi) (ui; ni,u + l)Y\J i-u^ 


12+1 ( Xi 

x\ 


i-i 


(28) 


When d = 2 or 3, the derivatives of T,-, for j G I are given by 


dTo ^ d ! p\o dwp,?' 

— “ E {uj;nj,i 2 + 1 ) —, 


dxu du. 

p 


d^Ti 


d-l 


dxj^dxij ^ \dupj 
d-2 d-l 


E (^ + 1 ) + ^^d-i {uf, 12 + 1 ) 


(29) 


p,j 


dxi^ dxi2 


+ EE 


,=4,=p+l 


{uj; Q.°,i 2 + 1) 


dupj duqj dUpj dUqj- 

dxjj dxij dxjj dxj^ 


(30) 


where Upj is the p-th element of uj, and when d = 3 


d^T. 


2 3 

= EE 

dxiidxi2dxi3 


/ 


V 


_.T, (,,..n° ,, 111 dupj ^ dugj Xupj 

, + ) E dxi^ dxi dxi, dxi,, dxi dx^ 


dUpjdUq^j 


r,s,tGl 


3 3 


I .T, 4 , .00 ,, I I'l ’sW dwp,j dupj dugj 

r,s,t&I 

r^s^t 


dv? du .^d-i\u,j,^Lj, 

P=1 q=i 

qj^p 


dxi^ dxi^ dxi^ 


+ 


duijdu2,jdu3j 


'i>d-i{uj;n°,i2 + l) ^ 


r,s,t&I 

r^s^t 


duij du2,j du3j 
dxj^ dxj^ dxi^ ' 


(31) 
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We provide the derivatives of the d-dimensional t cdf below. When d = 1 and for all x G 

d T ^ \ / / \ T / N (^ + 1 ) 2 ^ / / N 

—^(x; I/) = V’Cx; z/), v) = u), 

d^ {v + l){x‘^ - V + {u + l)x‘^) 

^^{x;iy) = - . ^ 2^2 - 

dx'^ [u + x^^y 

When d = 2 and for all x G Mi, 


-— ^2{x-., v) = y{xi] (x2.i; i^ + l), 

dxi 

n, u) = -y{xi-u) I ^ (^ 2 . 1 ; y + l) + i ) V' + 1 ) 


1 — uj^ V (^ + x\yi‘^ 


- —-— ^2{x-., v) = y2{x; v), 

dxidx2 

■<,fVioT.o 1,. . — / V+1 Xj-UJjjXl ^ ^ T n ^ T. 

wnere Vr.j - ^ ’ 3 ^ ^ ^ 3 ■> 

- f (^ + lyxi — {ly + l){iy — xf) 

^«'2(x; n,iy) = -^ {V2.i;v + 1) V'(xi; u) | ^- 

+ y{v2.i;i^ + 3)y{xi-,iy)J 

V \ — UJ‘^ {v + x\Yi 

X {xi{ujiy + X2Xi){2iy — 1) — X2(i^ + xf) 

(uj(iy + xf) + (x 2 — ujxi)xi) (ly + 2 )(x 2 — u>xi)(u>iy + X 2 X 1 ) 1 

(:^ + x2)(l - w2) + (X 2 - a;xi)2 J’ 


dx2dX2 ^ 27r^(l - w2)3/2 

When d = 3 and for all x G Mi, 


v(l — w2) 


^4'3(x;n,i/) = il}{x;v)'^2 |(x2.i, xs-i)"^; 1 /+ l| , 


i^d' 3 (x;n,i^) = [W + l)a;i X ^'2 {(?;2 i,W 3 i)^;f^ij^ + 1} 


dx{ 


h' + XI 


+ y{v2-l]V + 1) 


' V + 1 X 2 X 1 + a;i2J^ 

1 “ ^12 ^Jv + x\ 


X d/ 


^/v + 2 {(X3 - Wi3Xi)(l - UJI 2 ) - (w23 - W12W13)(X2 “ W 12 X 1 )} 

^/{(l - W^2)(^ + xl) + (X2 - Wi2Xi)2} {(1 - Lol^yi - W13) - (W 23 “ Wi 2 Wi 3 ) 2 } 


\iy + 2 


+ V’(v3-i;i^ + l) 


' V + 1 X 3 X 1 + Wi 3 ^ 
1 “ x’13 \/iy + xf 


X 'll 


\/iy + 2 {(X2 - Wi2Xi)(l - Lul^) - (W 23 - W 12 W 13 )(X 3 “ W13X1)} 

V{(1 - ‘^13)(^ + xl) + (X3 - Wi3Xi)2} {(1 - - (^23 “ Wi2Wi3)2} 


; z/ + 2 
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Axidx[ 


dx^dx2 


- 4 ' 3 (x; ft, ly) = ijy{x 2 ] i')i’ (1^1.2; v + 1 ) 


iy + 1 


X d/ 


[l - ujl2){u + xl) 

+ 2 {(a;3 — W23a;2)(l ~ <^^2) ~ (<^13 ~ Wi2W23)(a^i ~ ^12X2)} 

^{(1 - ujl2){v + x\) + (a;i - ^122:2)^} {(1 - wj2)(l “ ^23) “ (‘^is - ^12^23)^} 



^3(3;; v) = --tpixs; iy)'>p (vi.3; !y+l)^ 


iy + 1 


{v + 2)(xi — ^ 122 ^ 2 ) 


X v]/ 


V (1 -‘*^i3)(*" + a^3) L(1 -‘*^i2)(i" + a^2) + (a^i - wi2a;2)2 
\/v~+2 {(a;3 - a;23a;2)(l - W12) - (a;i3 - wi2W23)(a;i - a;i2a;2)} 


X f/) 


\/{{l- Ujf 2 ){t^ + xl) + {x- ^ 12 X 2 )^} {(1 - w? 2 )(l - ^*^ 23 ) “ (^*^13 - Wl2a;23)^} 

_ + 2(1 - Ujf 2 ) _ (a;i3 - 07120^23) - {xi - UJi2X2)ix3 - UJ 23 X 2 ) 

\/(l - “^ 12 )(1 “ “^ 23 ) “ (“^13 - Wi2a;23)^ {(1 - w^2)(^ + 2 ^ 2 ) + (a^i “‘^ 122 ^ 2 )^}^^^ 

+ 2 {(a;3 - u;23X2)(l - W 12 ) - (wi3 - Wi2a;23)(a;i - ^ 12 X 2 )} 


ly + 2 


\/{(l - ‘^12)(^ + ^1) + (^1 - ^12X2)^} {(1 - w^2)(l - ‘*^23) “ - Wi 2 a; 23 )^} 


',iy + 2 
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+ 'i> 


i^ + xi) 

y/i' + 2 [(xa — a;i3a;i)(l — UJ 12 ) — (^23 ~ ^I2^i3){x2 — ^^ 122 ^ 1 )] 

■^/[(l - w^2)(^ + xf) + (X2 - UI12X1Y] [(1 - a;?2)(l - wfa) - (^23 - ^12^13)^] 

V + l 2 {x 2 Xi + UJi2v){v + 2 )xi — v{x 2 — UJ12X1) 


; + 2 


X V’(w2-i;j^ + 1) 


1 Ujf2 


[v + x\Y/'^ 

{v + 2 )(x 2 — uJi 2 Xi)\/iy^Pi{x 2 Xi + a;i 2 J 2 )^ 
a /1 - Ujl2{l' + xly/"^ ((1 - Ujl2){v + x\) + {X2 - UJ12X1Y) 




y/v + 2 [(X2 - a;i2CCi)(l - ujI^) - (0^23 - ^i2C^i3)(a^3 - uJi^xi)] 


V[(l - Wi 3 )(^ + xj) + {X 3 - Wi 3 a;i) 2 ] [(1 - a;j2)(l “ ^13) “ (^23 - Wi 2 a;i 3 )^] 


'u-\-2 


IJ+l 2{x3Xi+UJi3l2){l2 + 2)xi-12{x3-0Ji3Xi) 
(l^ + 2 ){x3 - LOi3Xi)^/17^n{x3Xi + UJi3v)'^ 


+ Ip 


\/l - ((1 - a;^3)(z2 + x\) + {X3 - 1x13X1)'^) 

a/z7T2 [(x 3 - a;i3Xi)(l - ^^2) “ (^23 - uJi2UJi3)ix2 - Wi2a;i)] 


-^[(1 - a;f2)(^ + 2^1) + ix 2 - UJ12X1Y] [(1 - a;^2)(l “ ‘^la) “ (^^23 - ^120213)2] 
X 'lp{v2.l]V + 1) 


\u -\- 2 


(1 - w^2)(^ + 2 ) 


(1 — a;j 2 )(l ~ ^ 13 ) ~ {^"23 — ^ 12 ^ 13 )^ 


y/u + l{X2Xi -\-OJ12v) 


yjv + xl{{l - UJi2)i’^ + xj) + {X2 “ Wl2a;i)2)3/2 


((1 - Ujl2){l' + xl) + {X2 - UJi2Xif) 


1X23 — ^ 12^13 \ I / \ ^23 — Wi2a;i3 , . , , , 

X 1 W12-;-2-W13 - {X3 - W13X1) ---2- \X2 - LO12X1) ) {Xl - UI12X2) 


^-^12 J \ 1 - 

'^v + 2 [{X2 - uii2Xi){l - UJI3) - (W23 - Wi2a;i3)(a;3 - wias/] 

\/[(l - ^I3)i^ + xl) + ix3 - uJisXi)^] [(1 - a;^2)(l “ ‘^la) “ (^^23 - ^12^13)2] 

X ip {V3.1; 12 + 1) 


+ ip 


\u -\- 2 


(1 - io\3){v + 2) 


(1 — a;22)(l — L 0 I 3 ) ~ (^23 — ^ 12 ^ 13 )^ 


\/v~^{X3Xi + LO13V) 


ijv + xl{{l - U;^3)(j2 + xl) + {X 3 - Wi3Xi)2)3/2 _ 


((1 - Wi3)(!2 + xl) + (X3 - UJi3Xi)^) 


1 — (jJ 


13 


U >23 — ^ 12 <X ’13 \ I / \ '^23 — Wi 2 a;i 3 , , , , . 

^ I ‘X^i 3 —;--a;i 2 ) - [ {X2 - UJ12X1) - iZrpj2 - (3:3 - UJ13X) ) (a;i - a;i 3 a; 3 ) 


13 


Combining the derivatives of the t cdf with equations (28)-(31) provides the full d-dimensional 


densities of the extremal-t process. Returning to the extremal skew-t case (i.e. when a / 0 and 
T / 0), it is sufficient to consider the following changes. Firstly, rewrite 




U4 


-5, 


Tj = 


-^J 


1 


,12 + 1 


^-1 {Tj-,iy+l) 


3 e I, 
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where Un = 


v+l 
l-a;2 . 


l/y 


,i ^ Ij 


T 


following Definition 1 of the main paper. It 


can then be shown that 




+ 

ml 

mt 


i=2 

following Theorem 1 of the main paper. Note that equations 
through the redefinition of d <— d + 1 and uj <r- {uj,fj)~^. This in combination with the above 
derivatives of the t cdfs leads to the d-dimensional densities of the extremal-skew-f process. 


-(31) are still valid in this case, 


B.4 Composite likelihood simulation study 

We compare the efficiency of the maximum triplewise composite likelihood estimator with that 
based on the pairwise composite likelihood, discussed in Section 4 of the main paper, when data 
are drawn from an extremal-t process. We generate 300 replicate samples of size n = 20, 50 and 
70 from the extremal-t process with correlation function (10) in Section 2.2 of the main paper, 
with varying parameters, over 20 random spatial points on S = [0,100]^. Table presents 
the resulting relative efficiencies RE^/RE\/RE(^x,i) (xlOO), where RE^ = var(^ 3 )/var(^ 2 ), 
REx = vaf(A 3 )/var(A 2 ) and RE(^x,i) = cov(A 3 ,^ 3 )/cov(A 2 ,^ 2 )) where {Xmiim) are the m-wise 
maximum composite likelihood estimates (m = 2,3), and var and cov denote sample variance 
and covariance over replicates. Perhaps unsurprisingly, the triplewise estimates are at worst 
just as efficient as the pairwise estimates {RE < 100) but are frequently much more efficient. 
However this is balanced computationally as there is a corresponding increase in the number of 
components in the triplewise composite likelihood function. For each v, there is a general gain in 
efficiency when the smoothing parameter ^ increases for each fixed scale parameter A. There is 
a similar gain when increasing A for fixed These gains become progressively pronounced with 
increasing sample size n, and when there is stronger dependence present (i.e. smaller degrees 
of freedom v). However, we note that there are a number of instances where the efficiency gain 
goes against this general trend, which indicates that there are some subtleties involved. 

B.5 Marginal analysis of wind speed data 

The maximum daily observations of wind speed (1564 observations per station) are considered for 
each of the 4 monitoring stations CLOU, CLAY, SALL and PAUL. The t and skew-t distributions 
are fitted to the data using the maximum likelihood approach and a chi-square test is performed 
in order to investigate wether the slant parameter of the skew-t distribution is significantly 
different from zero. Additionally the Fisher-Pearson coefficient of skewness ( 7 ) is calculated. 
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Figure 9: Histogram of daily windspeed data, fitted t (red solid line) and skew-t (blue solid line) densities 
for each of the four stations CLOU (top-left), CLAY (top-right), SALL (bottom-left) and PAUL (bottom- 
right) . 


The marginal estimation results are collected in Table The estimated parameters are 
location /x, scale a and degrees of freedom ly for the t distribution and in addition the slant a 
for the skew-t distributions. The Table also displays the p-value of a chi-square test of a = 0 for 
each station. With a p-value of effectively zero, the marginal skewness of the data is established 
for each station. 

The red and blue solid lines in Figure respectively show the fitted t and skew-t densities 
compared to the histogram of the daily observations for each of the four monitoring stations. 
Each of the plots clearly shows that the datasets are right skewed and that the model with the 
ability to handle skewness provides a better fit. 
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V 

= 1 



n = 

20 





A\e 

0.5 

1 

1.5 

1.9 

2 

14 

89/94/89 

84/97/93 

83/69/79 

81/82/84 

78/64/72 

28 

76/100/98 

59/100/69 

73/86/73 

74/66/75 

34/75/26 

42 

81/100/100 

51/96/89 

51/80/88 

43/63/79 

33/51/72 

n = 

50 





A\e 

0.5 

1 

1.5 

1.9 

2 

14 

85/81/84 

87/78/86 

76/67/78 

66/56/72 

52/47/62 

28 

64/100/81 

81/79/82 

73/72/78 

72/66/74 

34/68/24 

42 

71/100/97 

33/61/59 

17/42/40 

17/34/37 

2/18/7 

n = 

70 





A\e 

0.5 

1 

1.5 

1.9 

2 

14 

80/87/83 

81/76/80 

74/65/77 

62/57/70 

47/42/60 

28 

51/100/68 

82/82/84 

72/72/77 

71/66/73 

54/53/62 

42 

56/93/89 

28/52/48 

13/40/14 

12/28/27 

8/23/26 

n = 

20 

V 

= 3 



A\e 

0.5 

1 

1.5 

1.9 

2 

14 

93/100/96 

93/96/91 

88/84/83 

84/83/84 

78/77/82 

28 

86/100/100 

72 l<d 7 l 7 b 

90/91/89 

87/85/86 

39/78/50 

42 

78/100/100 

72/97/100 

58/71/74 

51/68/95 

44/58/84 

n = 

50 





A\e 

0.5 

1 

1.5 

1.9 

2 

14 

91/85/89 

92/89/92 

86/81/88 

82/78/86 

64/64/74 

28 

70/100/81 

74/87/63 

83/81/84 

80/74/82 

77/75/81 

42 

69/100/100 

47/70/75 

36/53/64 

30/40/61 

38/32/33 

n = 

70 





A\e 

0.5 

1 

1.5 

1.9 

2 

14 

93/93/94 

89/88/87 

81/77/85 

81/74/84 

58/58/71 

28 

94/94/94 

85/87/89 

81/77/86 

79/75/82 

81/77/84 

42 

65/94/95 

44/57/62 

29/45/49 

25/35/50 

20/28/38 


Table 3; Efficiency of maximum triplewise likelihodiSestimators relative to maximum pairwise likelihood 
estimators for the Extremal-t process, based on 300 replicate simulations. Simulated datasets of size n = 
20, 50, 70 are generated at 20 random sites in S = [0,100]^, given power exponential dependence function 
parameters = (A,^). Relative efficiencies are RE^/REx/RErx,n (xlOO) where RE^ = vaf(^ 3 )/vaf(^ 2 ), 



Station 

Model 

A 

a 

a 

0 

p-value 

7 

CLOU 

t 

11.84 

2.75 

- 

5.78 

- 

- 


skew-t 

8.51 

20.24 

2.79 

11.21 

0 

1.17 

CLAY 

t 

12.63 

3.50 

- 

6.40 

- 

- 


skew-t 

8.23 

35.53 

3.28 

16.61 

0 

1.12 

SALL 

t 

14.66 

4.27 

- 

7.47 

- 

- 


skew-t 

9.02 

58.76 

4.20 

50.98 

0 

0.92 

PAUL 

t 

15.76 

4.25 

- 

9.31 

- 

- 


skew-t 

11.43 

38.55 

1.78 

17.81 

0 

0.79 


Table 4: Outcome of the marginal analysis of the four stations. 
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